96 research outputs found
Recurrent kernel machines : computing with infinite echo state networks
Echo state networks (ESNs) are large, random recurrent neural networks with a single trained linear readout layer. Despite the untrained nature of the recurrent weights, they are capable of performing universal computations on temporal input data, which makes them interesting for both theoretical research and practical applications. The key to their success lies in the fact that the network computes a broad set of nonlinear, spatiotemporal mappings of the input data, on which linear regression or classification can easily be performed. One could consider the reservoir as a spatiotemporal kernel, in which the mapping to a high-dimensional space is computed explicitly. In this letter, we build on this idea and extend the concept of ESNs to infinite-sized recurrent neural networks, which can be considered recursive kernels that subsequently can be used to create recursive support vector machines. We present the theoretical framework, provide several practical examples of recursive kernels, and apply them to typical temporal tasks
One step backpropagation through time for learning input mapping in reservoir computing applied to speech recognition
Recurrent neural networks are very powerful engines for processing information that is coded in time, however, many problems with common training algorithms, such as Backpropagation Through Time, remain. Because of this, another important learning setup known as Reservoir Computing has appeared in recent years, where one uses an essentially untrained network to perform computations. Though very successful in many applications, using a random network can be quite inefficient when considering the required number of neurons and the associated computational costs. In this paper we introduce a highly simplified version of Backpropagation Through Time by basically truncating the error backpropagation to one step back in time, and we combine this with the classic Reservoir Computing setup using an instantaneous linear readout. We apply this setup to a spoken digit recognition task and show it to give very good results for small networks
Memory in reservoirs for high dimensional input
Reservoir Computing (RC) is a recently introduced scheme to employ recurrent neural networks while circumventing the difficulties that typically appear when training the recurrent weights. The ‘reservoir’ is a fixed randomly initiated recurrent network which receives input via a random mapping. Only an instantaneous linear mapping from the network to the output is trained which can be done with linear regression. In this paper we study dynamical properties of reservoirs receiving a high number of inputs. More specifically, we investigate how the internal state of the network retains fading memory of its input signal. Memory properties for random recurrent networks have been thoroughly examined in past research, but only for one-dimensional input. Here we take into account statistics which will typically occur in high dimensional signals. We find useful empirical data which expresses how memory in recurrent networks is distributed over the individual principal components of the input
Memristor models for machine learning
In the quest for alternatives to traditional CMOS, it is being suggested that
digital computing efficiency and power can be improved by matching the
precision to the application. Many applications do not need the high precision
that is being used today. In particular, large gains in area- and power
efficiency could be achieved by dedicated analog realizations of approximate
computing engines. In this work, we explore the use of memristor networks for
analog approximate computation, based on a machine learning framework called
reservoir computing. Most experimental investigations on the dynamics of
memristors focus on their nonvolatile behavior. Hence, the volatility that is
present in the developed technologies is usually unwanted and it is not
included in simulation models. In contrast, in reservoir computing, volatility
is not only desirable but necessary. Therefore, in this work, we propose two
different ways to incorporate it into memristor simulation models. The first is
an extension of Strukov's model and the second is an equivalent Wiener model
approximation. We analyze and compare the dynamical properties of these models
and discuss their implications for the memory and the nonlinear processing
capacity of memristor networks. Our results indicate that device variability,
increasingly causing problems in traditional computer design, is an asset in
the context of reservoir computing. We conclude that, although both models
could lead to useful memristor based reservoir computing systems, their
computational performance will differ. Therefore, experimental modeling
research is required for the development of accurate volatile memristor models.Comment: 4 figures, no tables. Submitted to neural computatio
Photonic Delay Systems as Machine Learning Implementations
Nonlinear photonic delay systems present interesting implementation platforms
for machine learning models. They can be extremely fast, offer great degrees of
parallelism and potentially consume far less power than digital processors. So
far they have been successfully employed for signal processing using the
Reservoir Computing paradigm. In this paper we show that their range of
applicability can be greatly extended if we use gradient descent with
backpropagation through time on a model of the system to optimize the input
encoding of such systems. We perform physical experiments that demonstrate that
the obtained input encodings work well in reality, and we show that optimized
systems perform significantly better than the common Reservoir Computing
approach. The results presented here demonstrate that common gradient descent
techniques from machine learning may well be applicable on physical
neuro-inspired analog computers
MACOP modular architecture with control primitives
Walking, catching a ball and reaching are all tasks in which humans and animals exhibit advanced motor skills. Findings in biological research concerning motor control suggest a modular control hierarchy which combines movement/motor primitives into complex and natural movements. Engineers inspire their research on these findings in the quest for adaptive and skillful control for robots. In this work we propose a modular architecture with control primitives (MACOP) which uses a set of controllers, where each controller becomes specialized in a subregion of its joint and task-space. Instead of having a single controller being used in this subregion (such as MOSAIC on which MACOP is inspired), MACOP relates more to the idea of continuously mixing a limited set of primitive controllers. By enforcing a set of desired properties on the mixing mechanism, a mixture of primitives emerges unsupervised which successfully solves the control task. We evaluate MACOP on a numerical model of a robot arm by training it to generate desired trajectories. We investigate how the tracking performance is affected by the number of controllers in MACOP and examine how the individual controllers and their generated control primitives contribute to solving the task. Furthermore, we show how MACOP compensates for the dynamic effects caused by a fixed control rate and the inertia of the robot
A differentiable physics engine for deep learning in robotics
An important field in robotics is the optimization of controllers. Currently, robots are often treated as a black box in this optimization process, which is the reason why derivative-free optimization methods such as evolutionary algorithms or reinforcement learning are omnipresent. When gradient-based methods are used, models are kept small or rely on finite difference approximations for the Jacobian. This method quickly grows expensive with increasing numbers of parameters, such as found in deep learning. We propose an implementation of a modern physics engine, which can differentiate control parameters. This engine is implemented for both CPU and GPU. Firstly, this paper shows how such an engine speeds up the optimization process, even for small problems. Furthermore, it explains why this is an alternative approach to deep Q-learning, for using deep learning in robotics. Finally, we argue that this is a big step for deep learning in robotics, as it opens up new possibilities to optimize robots, both in hardware and software
Automated design of complex dynamic systems
Several fields of study are concerned with uniting the concept of computation with that of the design of physical systems. For example, a recent trend in robotics is to design robots in such a way that they require a minimal control effort. Another example is found in the domain of photonics, where recent efforts try to benefit directly from the complex nonlinear dynamics to achieve more efficient signal processing. The underlying goal of these and similar research efforts is to internalize a large part of the necessary computations within the physical system itself by exploiting its inherent non-linear dynamics. This, however, often requires the optimization of large numbers of system parameters, related to both the system's structure as well as its material properties. In addition, many of these parameters are subject to fabrication variability or to variations through time. In this paper we apply a machine learning algorithm to optimize physical dynamic systems. We show that such algorithms, which are normally applied on abstract computational entities, can be extended to the field of differential equations and used to optimize an associated set of parameters which determine their behavior. We show that machine learning training methodologies are highly useful in designing robust systems, and we provide a set of both simple and complex examples using models of physical dynamical systems. Interestingly, the derived optimization method is intimately related to direct collocation a method known in the field of optimal control. Our work suggests that the application domains of both machine learning and optimal control have a largely unexplored overlapping area which envelopes a novel design methodology of smart and highly complex physical systems
- …